Spectral cross-correlation features for audio indexing of broadcast news and meetings
Abstract
This paper evaluates three new acoustic feature parameters for detecting audio source segments, all based on spectral cross-correlation: spectral stability, white noise similarity, and sound spectral shape. The parameters are designed for accurate audio source detection and are used in a pre-processing module for automatic indexing of broadcast news and meetings. We conducted two audio source classification experiments, one on broadcast news and one on meetings. The broadcast news experiment shows that the proposed parameters capture audio sources more accurately than conventional parameters: classification performance increases by 6.6% when the proposed parameters are used. Spectral stability proved to be the most effective of all the parameters tested, conventional and proposed. For the meeting corpus, we performed speaker identification in addition to audio source classification. First, the audio source classification procedure detects each sound source segment. Then, a speaker identification procedure finds cross-talk from other participants and determines each participant's own speech periods. Speaker identification performance increases by 2.7% when the proposed parameters are used.
Related resources
Feature Selection for Trainable Multilingual Broadcast News Segmentation
Indexing and retrieving broadcast news stories within a large collection requires automatic detection of story boundaries. This video news story segmentation can use a wide range of audio, language, video, and image features. In this paper, we investigate the correlation between automatically-derived multimodal features and story boundaries in seven different broadcast news sources in three lan...
Audio Hot Spotting And Retrieval Using Multiple Features
This paper reports our on-going efforts to exploit multiple features derived from an audio stream using source material such as broadcast news, teleconferences, and meetings. These features are derived from algorithms including automatic speech recognition, automatic speech indexing, speaker identification, prosodic and audio feature extraction. We describe our research prototype – the Audio Ho...
A System for Speaker Detection and Tracking in Audio Broadcast News
A system for speaker-based audio-indexing and an application for speaker-tracking in broadcast news audio are presented. The process of producing an indexing information in continuous audio streams based on detected speakers is composed of several tasks and is therefore treated as a multistage process. The main building blocks of such an indexing system include components for an audio segmentat...
A system for the retrieval of Italian broadcast news
This paper presents a prototype for the retrieval of Italian broadcast news, which has been developed at ITC-irst. The architecture employs a speech recognition engine for the automatic transcription of audio news. Moreover, it features document indexing based on part-of-speech tagging of text coupled with morphological analysis, and query expansion exploiting the Italian WordNet thesaurus. Que...
An Analysis of Sentence Segmentation Features for Broadcast News, Broadcast Conversations, and Meetings
Information retrieval techniques for speech are based on those developed for text, and thus expect structured data as input. An essential task is to add sentence boundary information to the otherwise unannotated stream of words output by automatic speech recognition systems. We analyze sentence segmentation performance as a function of feature types and transcription (manual versus automatic) f...
Publication date: 2005